Document or message security arrangements

The present invention relates to arrangements for the 
protection of documents against forgery or repudiation. The 
invention also relates to arrangements for the protection of 
electronically transmitted messages against forgery or 
repudiation.
It is common nowadays to provide security to documents 
through the use of holograms, watermarks, personal signature, 
notary stamps and other physical means: these all increase the 
difficulty of making unauthorised imitations or changes;
however, they all require physical inspection, often involving
forensic equipment and expertise, in order to detect a 
counterfeit. It is also becoming increasingly necessary to 
provide security for electronically transmitted messages. 

The present invention provides for the security of the
text of a document or message by cryptographic techniques.

In accordance with the present invention, there is 
provided an apparatus which is arranged to process a selected 
part or selected parts of the text of a document or message to 
form a hash, the hash usually being of fewer characters than
the selected part or parts of the text, the processing
comprising retrieving numerical values which define the 
respective characters of the selected part or parts of the text 
and making a calculation using the numerical values of the 
successive characters. 

The apparatus may be arranged to receive or create a
text in electronic form, then process this text to derive the
hash of the selected part or parts of the text. The apparatus
may further be arranged to add the hash to the text:
typically, the apparatus then outputs the text, with the added
hash, either for printing as a document or for electronic
transmission. Alternatively the apparatus may be arranged to
output the text and the hash separately (or store one and
output the other).

The practical value of the hash is that it is sensitive
to any change or alteration in the selected part of the text
from which it is derived: it is not feasible to make a desired
alteration to that part of the text whilst preserving the same
hash value.

The hash thus forms a cryptographic signature which 
makes forgery detectable on the basis of an assessment of the
content of the text and without the need for any forensic 
examination of the document. 

The hash algorithm is not applied to the whole text, 
only to a selected part, or to selected parts. The or each
part is identified, or sealed, by predetermined characters or
combinations of characters immediately preceding and
immediately following it: for example, a series of tilde marks
(~) may be used.
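By way of illustration only, the identification of the sealed
parts might be sketched as follows in Python. The sketch assumes
that a seal is a run of exactly five tilde marks, as in the worked
Example described later; the function name sealed_parts and the
sample document are hypothetical and are not taken from the
original text.

```python
import re

# Assumption: a seal is a run of five tildes, as in the worked Example.
# The text only requires "predetermined characters" before and after a part.
SEAL = "~~~~~"


def sealed_parts(text: str, seal: str = SEAL) -> list[str]:
    """Return every part of `text` enclosed between a pair of seal markers."""
    pattern = re.escape(seal) + r"(.*?)" + re.escape(seal)
    # DOTALL lets a sealed part run across line breaks ("return" keystrokes).
    return re.findall(pattern, text, flags=re.DOTALL)


document = (
    "Dear Sir,\n"
    "~~~~~Pay J. Smith the sum of 100 pounds~~~~~\n"
    "Yours faithfully"
)
print(sealed_parts(document))  # ['Pay J. Smith the sum of 100 pounds']
```

Only the text returned by such a function would be passed to the
hash algorithm; the unsealed remainder of the document could be
changed freely without affecting the hash.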

Preferably the numerical values of the respective
characters of the selected text are their ASCII values: the
characters preferably include all keystrokes (including space,
return etc.); preferably the "alphabet" is restricted to all
keystrokes having ASCII values in the range 32 to 125 inclusive
and also including the ASCII value for the "return".

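A minimal sketch of this retrieval step is given below, in
Python. It assumes that the "return" keystroke means carriage
return (ASCII 13), which the text does not specify; the function
name char_values is hypothetical.

```python
# Assumption: "return" is taken to be carriage return (ASCII 13); the text
# does not say whether CR, LF or both are meant.
RETURN_CODES = {13}


def char_values(part: str) -> list[int]:
    """Map each character of a sealed part to its ASCII value, rejecting
    anything outside the restricted alphabet (32 to 125, plus return)."""
    values = []
    for ch in part:
        code = ord(ch)
        if not (32 <= code <= 125 or code in RETURN_CODES):
            raise ValueError(f"character {ch!r} (code {code}) is outside the alphabet")
        values.append(code)
    return values


print(char_values("s 1"))  # [115, 32, 49]
```

It may be noted that the tilde has ASCII value 126 and so falls
outside the range 32 to 125, which is consistent with its use as
a seal mark rather than as a character to be hashed.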
Preferably the processing is recursive, in that the
calculation in respect of each character uses the result of the
calculation made in respect of at least one previous character.

Preferably the calculations for the first several (e.g.
10) characters use successive ones of a set of initial
variables: preferably the calculation for each subsequent
character uses, instead of an initial variable, the result of
the calculation in respect of a previous character.

Preferably each calculation also uses one of a 
predetermined set of prime numbers. Preferably each 
calculation uses an interim result to determine which of these
prime numbers is used to complete the calculation. 

Preferably the processing involves at least a second 
pass over the selected part or parts of the text: in other 
words, once the calculation for the last character is 
completed, a second series of successive calculations is
carried out on the characters, typically starting with the
first character, and using the results of the calculations of 
the first series. 

At the end of the above-described processing, the hash
is formed by taking selected digits from the results obtained
in a final plurality of the calculations: for example the
final two digits may be taken from each of the final 10
results, and a 20-digit hash formed by placing these 10 pairs
of digits in a given order.
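The digit-selection step can be sketched as follows in Python.
The sketch assumes that the "given order" is simply the order in
which the final 10 results were calculated; the function name
hash_from_results and the sample results are hypothetical.

```python
def hash_from_results(results: list[int]) -> str:
    """Form a 20-digit hash from the final two digits of the last 10 results."""
    if len(results) < 10:
        raise ValueError("at least 10 results are needed")
    # r % 100 keeps the final two digits; :02d restores a leading zero.
    return "".join(f"{r % 100:02d}" for r in results[-10:])


# Hypothetical results, chosen only to show the digit selection.
sample = [644, 16002, 25149, 7, 12345, 100, 99, 5, 27298, 31]
print(hash_from_results(sample))  # 44024907450099059831
```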

One form of hash algorithm used in the invention is an
Objective Linguistic Hash (OLH). This is linguistic in that
it "reads" letters, numbers and other keys commonly used in the
preparation of documents. It is objective in that the hash
value produced can be verified by anyone using the algorithm.
The OLH algorithm produces a final number by acting recursively
one character at a time throughout the length of the message.

The variability of the message far exceeds the
variability of the final hash, so inevitably many different
messages would have the same hash value. However, it is
unfeasible to make a meaningful change to the message whilst
retaining the same hash number.

It will be appreciated that the invention may be
incorporated in a word processing apparatus. In this use, a
document is created in electronic form on the apparatus,
complete with the seal (e.g. series of tilde marks) at the
beginning and end of the or each selected part of the text.
A "sealing" command is then performed, whereupon the apparatus
automatically processes the "sealed" part or parts of the text
to create the hash, which is stored with the text.
Subsequently, the document can be altered or corrected as
necessary, then "re-sealed", to process the sealed part or
parts of the text again and create the hash afresh. Once the
document is finalised, it can be printed out, complete with the
hash.

The above-mentioned OLH algorithm may be modified to
provide a Subjective Linguistic Hash (SLH). This differs from
the OLH in that it is made subjective by being "seeded" with
secret information known only to an accredited authority:
thus, the processing of the selected or "sealed" part or parts
of the text is carried out using secret initial variables.
Preferably use is made of a seed, in the form of a very large
secret number (typically having 50 to 200 digits) known as the
Secret Primitive (SP). An algorithm is run, using the SP, to
produce the initial variables: preferably this algorithm also
uses a number of items of open information, known as Open
Primitives (OP's), contained in the document or message being
protected. The SLH algorithm may produce a plain hash
initially, then encrypt this using the SP as secret key: this
preserves the secrecy of the plain hash.

A further algorithm which can be used in accordance
with the invention is a Subjective Encrypted Hash (SEH)
algorithm. This involves encrypting an OLH hash, using secret
primitive values known only to a witnessing party, together
with open primitive values such as date and time. In this
case, the witnessing party uses an apparatus into which the OLH
of a document or message is keyed, together with the open
primitive values, and which encrypts the OLH using the SEH
algorithm, to create the SEH hash which is preferably printed
on the document, or on a label for application to the document.
Preferably the apparatus stores the initial OLH and the final
SEH, together with the open primitive values.

Embodiments of the present invention will now be 
described by way of examples only and with reference to an
accompanying table, which is a worked Example to illustrate the 
use of an OLH algorithm. 

The accompanying Example uses an OLH algorithm on a
selected part of the text of a document, typically a word
processed document, namely the part between the two series of
five tilde marks (~~~~~). The hash algorithm uses a set of
initial values or variables (IV's), in this example
2, 4, 8, 16, 32, 64, 128, 256, 512 and 1024: the algorithm
additionally uses a set of 64 prime numbers (each of 5 digits)
used as modulators and also three prime numbers, preferably
37, 17 and 7, as will be shown below. The table shows the
processing carried out to create a 20-digit hash. The
following shows the manner in which the calculations proceed,
taking for example the 16th row: it will be noted that the
part of the message to be processed is set out character-by-
character in the first column; the rows are numbered 0 to 9
cyclically (starting with 1) in the second column; the initial
variables are used in turn in the 5th column, for the first 10
rows.
Thus, reading across the 16th row, we have:

s        = the input character
6        = the "y" value, i.e. the choice of recursive P(y)
115      = the ASCII value of "s"
25149    = the value of the result on the preceding row, namely
           P(5) = 25149
16002    = the last value of P(y), i.e. P(6)
41266    = 115 + 25149 + 16002 (the sum of the values in the
           three preceding columns in the same row)
16       = the value of n, where n is the ordinal of the
           character in the text
5405846  = 41266 x (115 + 16)
37       = the value of z
         = (37x115 + 17x16 + 7x25149 + 30539) mod 64
           (using the prime numbers 37, 17, 7)
27299    = the value of the 37th of the set of 64 5-digit prime
           numbers
644      = 5405846 mod 27299

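The arithmetic of the 16th row can be checked with a short Python
sketch. The function name olh_row and its parameter names are
hypothetical. The constant 30539 appears in the row without
explanation and is simply passed in as given; of the 64 prime
numbers only the 37th (27299) is stated in the text, so only that
entry is supplied.

```python
def olh_row(char: str, n: int, prev_result: int, last_p_y: int,
            primes: dict[int, int], extra: int = 30539) -> tuple[int, int]:
    """Return (prime index, row result) for one row of the worked Example."""
    c = ord(char)                       # ASCII value of the input character
    total = c + prev_result + last_p_y  # sum of the three preceding columns
    product = total * (c + n)           # multiplied by (ASCII value + ordinal)
    # Interim result that selects which of the 64 primes is the modulator.
    index = (37 * c + 17 * n + 7 * prev_result + extra) % 64
    return index, product % primes[index]


# Only the 37th prime is known from the text; the other 63 are not listed.
index, result = olh_row("s", 16, 25149, 16002, {37: 27299})
print(index, result)  # 37 644
```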
It will be noted that the calculation on each row is
recursive, in that it uses results produced on previous rows
(see the 4th and 5th items in each row). Further, in the
example shown, the algorithm makes a second pass over the
sealed part of the text. Finally, the 20-digit OLH hash is
produced by selecting the final two digits of the results
(final column) of the final 10 rows, placed in the order of
recursive P(y) = 0 to 9.

Any attempt to alter the sealed part of the text, 
whilst retaining the same hash value, would require subsequent 
alterations in all further recursive steps to the end of that 
text. This is inherently difficult, but made more so by 
continuing the recursion back to the beginning of the sealed 
text for the second pass: a third pass may additionally be 
made. 
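The recursion and the second pass can be illustrated by the
following toy sketch in Python, which applies the row formula of
the worked Example to every character. It is not the OLH
algorithm itself, since several details are not given in the
text and are assumed here: the 64 five-digit primes are replaced
by the first 64 primes above 10000; the constant 30539 is reused
on every row; the "preceding row" result for the first row is
taken as 0; and the ordinal n keeps counting through the second
pass. The names toy_olh and first_primes_above are hypothetical.

```python
INITIAL_VARIABLES = [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]  # from the Example


def first_primes_above(start: int, count: int) -> list[int]:
    """Assumed stand-in for the 64 five-digit primes, which are not listed."""
    primes, candidate = [], start
    while len(primes) < count:
        candidate += 1
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            primes.append(candidate)
    return primes


PRIMES = first_primes_above(10000, 64)


def toy_olh(part: str, passes: int = 2, extra: int = 30539) -> str:
    """Toy two-pass recursive hash following the row formula of the Example."""
    # p[y] holds the latest result for each cyclic row number y (0 to 9);
    # row 1 starts from the first initial variable, row 10 (y = 0) from the last.
    p = {(i + 1) % 10: iv for i, iv in enumerate(INITIAL_VARIABLES)}
    prev = 0  # assumption: no preceding result exists before the first row
    n = 0     # assumption: the ordinal is not reset between passes
    for _ in range(passes):
        for ch in part:
            n += 1
            y = n % 10
            c = ord(ch)
            total = c + prev + p[y]
            index = (37 * c + 17 * n + 7 * prev + extra) % 64
            prev = (total * (c + n)) % PRIMES[index]
            p[y] = prev
    # Final two digits of the latest result for P(0) to P(9), in that order.
    return "".join(f"{p[y] % 100:02d}" for y in range(10))


print(toy_olh("Pay J. Smith the sum of 100 pounds"))
print(toy_olh("Pay J. Smith the sum of 900 pounds"))
```

Because every row feeds the next, the altered character in the
second message changes the results of the later rows, and the
second pass carries that change back over the characters that
precede the alteration.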

The above-described OLH algorithm may be modified to
form a Subjective Linguistic Hash (SLH). The SLH differs from
the OLH in that it is made subjective by being "seeded" with
secret information known only to an accredited authority: the
initial values (IV's) are therefore secret. Preferably the
seed is a very large secret number, typically with 50 to 200
digits, known as the Secret Primitive (SP) and known only to
the issuing authority. The SLH algorithm "fuses" the SP with
open information, known as Open Primitives (OP's), contained
in the document or message, to produce the initial variables
(IV's). Preferably the algorithm produces a "plain hash" in
the first instance, which is then doubly encrypted using the
SP as secret key. This preserves the secrecy of the plain hash
and makes it mathematically unfeasible to work backwards
through the document to discover the primitives.
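The text does not disclose how the SP is "fused" with the OP's.
Purely as an illustration, the Python sketch below substitutes
SHA-256 from the standard library for the fusing step; this is a
swapped-in technique and not the method of the invention. The
function name initial_variables, the sample SP and the sample
OP's are all hypothetical.

```python
import hashlib


def initial_variables(secret_primitive: int, open_primitives: list[str],
                      count: int = 10) -> list[int]:
    """Derive `count` secret initial variables from an SP and the OP's.

    Assumption: SHA-256 stands in for the undisclosed fusing algorithm.
    """
    ivs = []
    for i in range(count):
        material = f"{secret_primitive}|{'|'.join(open_primitives)}|{i}"
        digest = hashlib.sha256(material.encode("ascii")).digest()
        ivs.append(int.from_bytes(digest[:4], "big"))  # one 32-bit value per IV
    return ivs


sp = int("7" * 60)                      # hypothetical 60-digit Secret Primitive
ops = ["1996-03-01", "mileage 41250"]   # hypothetical Open Primitives
print(initial_variables(sp, ops))
```

Such secret values would take the place of the open initial
variables of the OLH, so that only a holder of the SP could
recompute the hash.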

A further algorithm which can be used is a Subjective
Encrypted Hash (SEH). The SEH involves encrypting an OLH hash.
The encryption incorporates secret primitive (SP) values known
only to a witnessing party, and open primitive (OP) values such
as date and time or other non-repeating factors. Further, the
encryption is one-way, because the OLH is also fused with the
OP and SP values. Since the key is therefore part of the
message, the crypt cannot be reversed by application of the
key. Every output value of a fixed OLH is therefore distinct,
due to non-repeating elements in the OP's.
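The one-way, keyed character of the SEH can be illustrated in
Python as below. The SEH algorithm itself is not set out in the
text, so the sketch substitutes HMAC with SHA-256 from the
standard library as a stand-in keyed one-way function; the
function name seh, the 20-digit output format and the sample
values are assumptions made for the illustration.

```python
import hashlib
import hmac


def seh(olh: str, secret_primitive: int, open_primitives: list[str]) -> str:
    """Return a 20-digit keyed one-way value for an OLH hash.

    Assumption: HMAC-SHA-256 stands in for the undisclosed SEH encryption.
    """
    key = str(secret_primitive).encode("ascii")
    message = "|".join([olh, *open_primitives]).encode("ascii")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    # Reduce the digest to 20 decimal digits, keeping leading zeros.
    return f"{int.from_bytes(digest, 'big') % 10 ** 20:020d}"


sp = int("3" * 80)                    # hypothetical witness-held SP
olh = "44024907450099059831"          # hypothetical OLH keyed in by the witness
print(seh(olh, sp, ["1996-03-01", "14:05:09"]))
print(seh(olh, sp, ["1996-03-01", "14:05:10"]))
```

The two calls use the same OLH but date/time values one second
apart, showing how a non-repeating OP gives a fixed OLH a
different output on each occasion.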

A number of possible applications of the invention will 
now be described by way of examples only. 

A first application of the invention is for preventing
fraudulent alteration of a Vehicle Registration Document. It
is well known that stolen or redundant Vehicle Registration
Documents have a value in the process of "ringing", that is,
altering the identity of a stolen car. To complete the fraud,
a plausible Vehicle Registration Document is required. In a
first case, the ringer will have to make a forged alteration
to the document, for example, to cover a re-spray in a
different colour. In a second case, if the ringer can alter
the identity of a car, exactly to match the Vehicle
Registration Document, then the fraud is undetectable to an
unsuspecting buyer.

The present invention can prevent fraud in either case 
in the following way. When a vehicle is insured, the important 
fixed elements of the particular information concerning the 
vehicle and its keeper form the message parts of the hash
algorithm. The secret primitives are in the possession of the
insurer of the vehicle. 

Examples of message parts are:

Owner, Keeper, Registration Number, Make,
Model, Colour, Chassis Number, Engine Number.

These parts are impossible to alter in a fraudulent 
way, without knowledge of the corresponding altered value of 
the SLH. Thus an SLH hash marked on the Registration Document 
protects against the first case of fraud. To solve the problem
of the second case, OP's are added as follows:

Insurance Renewal Date, Mileage on last
insurance, Stated value on last insurance.

These OP's have to be altered in the second case to
give a vehicle a new false history. It is not possible for a
ringer to do this because the true history is protected by
earlier SLH's.

The SP for a given Insurance Company or other authority 
would preferably be a very large number, typically of 50 to 200 
digits. It is preferable that the insurance company produces
an updated SLH each year, using details of the vehicle and its
keeper held or added to its stored record for that client, and 
including the vehicle mileage: the SLH may then be printed. 

In a variation applicable to a vehicle registration
document, the insurance company may produce an SLH each year,
using details of the vehicle and its keeper, including the
vehicle mileage: the SLH is then printed on a sticker,
together with open information of the vehicle (e.g. mileage,
value of the vehicle) for the keeper to stick on the vehicle
registration document. Each time the insurance is renewed, an
additional such sticker is created for the keeper to add to the
registration document. It will be appreciated that the
registration document will thus include, in respect of each
renewal, a hash related to data printed in selected parts or
fields of the document.

A second application of the invention is relevant to
high value tickets, bought in advance where there is high risk
of fraud. This form of fraud is rife for example in the sale
of tickets for long-awaited pop concerts where the forged
tickets are sold to young people in a social context where they
are likely to be susceptible. Nothing can prevent a buyer from
purchasing a ticket where there is no ready means of verifying
its data, but with a suitable warning this application of the
invention exerts psychological pressure due to the uncertainty
that a ticket bought from an unofficial source will be valid
on the day of the concert. A suitable warning might read as
follows:

Warning: If you have bought
this ticket from an
unauthorised source, it may be
a perfect forgery. Only
genuine tickets will pass the
electronic test at the
turnstile. Do not run the risk
of being turned away.

Each event is given an SP which is available as an
input to the software used at legitimate outlets. This SP is
only released to points of entry to the concert immediately
before the crowds start to appear. The point of entry has a
machine for reading the hash from the ticket: the hash may be
printed on the ticket, at the time of issue, in both human-
readable and machine-readable form. The OP is a combination
of the date and time of sale, correct to the nearest second,
and the name of the buyer. The SLH is also printed on the
ticket. Even if the fraudster prints a very recent time and
date, it is mathematically unfeasible to calculate the
appropriate SLH, so he has a hazardous task of persuading the
buyer that he/she must attach no significance to the lapse of
time. Further, the buyer who reads the warning on the reverse
side of the ticket is put under the psychological pressure of
having to wait for the concert itself before knowing whether
the ticket is valid.

A third application of the invention is relevant to 
National Identity Cards which display a photograph and personal 
details of the legitimate owner. The invention provides for 
a massive SP (containing at least 400 figures) held in a
tamper-proof location. The printed matter of the card is classified
either as message parts to be hashed or OP's. The SLH is
printed on the face of the card as additional information.
This prevents alteration of a card or the printing of a false
identity.

A fourth application of the invention is the use of a
Trusted Third Party such as an accredited Notary Public to
provide an SEH supplied with a pre-calculated OLH for a
"sealed" part of a document. The document itself may either
be sent in plain or in crypt. The function of the notary is
to use the OLH to calculate the SEH. The document may be
processed to provide it with a double header, the OLH and the
SEH which incorporates a date/time stamp. In the event of a
dispute both "versions" of the disputed text can be tested by
an OLH, but only the valid OLH will have the proper SEH.

1) An apparatus which is arranged to process a selected
part or parts of the text of a document or message to form a
hash, the hash usually being of fewer characters than the
selected part or parts of the text, the processing comprising
retrieving numerical values which define the respective
characters of the selected part or parts of the text and making
a calculation using the numerical values of the successive
characters.

2) An apparatus as claimed in claim 1, which is arranged
to receive or create said text in electronic form, then process 
said text to derive said hash. 

3) An apparatus as claimed in claim 2, arranged to add 
said hash to said text. 

4) An apparatus as claimed in claim 3, arranged to output
said text, with the added hash.

5) An apparatus as claimed in claim 2, arranged to output
said text and its hash separately, or to store one and output
the other.

6) An apparatus as claimed in any preceding claim,
arranged for the or each said part of said text to be
identified by predetermined characters or combinations of
characters immediately preceding and following it.

7) An apparatus as claimed in claim 6, in which each said
identifier comprises a series of tilde marks.

8) An apparatus as claimed in any preceding claim, in 
which said numerical values of the respective characters of the 
selected text are their ASCII values. 

9) An apparatus as claimed in claim 8, in which an
alphabet which includes all said characters is restricted to
all keystrokes having ASCII values in the range 32 to 125
inclusive.

10) An apparatus as claimed in any preceding claim,
arranged so that said processing of said selected part or parts
of said text comprises recursive processing, in that the
calculation in respect of each character uses the result of the
calculation made in respect of at least one previous character.

11) An apparatus as claimed in claim 10, arranged so that
the calculations made for a first plurality of characters use
successive ones of a set of initial variables.

12) An apparatus as claimed in claim 10 or 11, arranged so
that each said calculation also uses one of a predetermined set
of prime numbers.

13) An apparatus as claimed in claim 12, arranged such that
each said calculation uses an interim result to determine which
of said prime numbers is used to continue the calculation.

14) An apparatus as claimed in any preceding claim, 
arranged so that said processing involves at least a second 
pass over the selected part or parts of said text. 

15) An apparatus as claimed in any preceding claim,
arranged so that at the end of said processing, the hash is
formed by taking selected digits from the results obtained in
a final plurality of said calculations.

16) An apparatus as claimed in any preceding claim,
arranged such that said hash is seeded with secret information.

17) An apparatus as claimed in claim 16, arranged such that 
said processing is carried out using secret initial variables. 

18) An apparatus as claimed in any preceding claim, 
arranged to encrypt said hash. 

19) An apparatus as claimed in claim 18, arranged to store
said hash and the encrypted hash formed from it. 

20) An article carrying information in the form of printed
or electronic text and also carrying a hash formed from a
selected part or parts of said text, the hash usually being of
fewer characters than the selected part or parts of said text
and formed by making a calculation using numerical values which
define the respective characters of said selected part or parts
of said text.

21) A process of forming a hash from a selected part or
parts of the text of a document or message, the process
comprising retrieving numerical values which define the
respective characters of the selected part or parts of said
text and making a calculation using the numerical values of the
successive characters, said hash usually being of fewer
characters than said selected part or parts of the text.